Delving into Multilingual Ethical Bias: The MSQAD with Statistical Hypothesis Tests for Large Language Models
Seunguk Yu, Juhwan Choi, Youngbin Kim
Despite recent strides in large language models (LLMs), studies have underscored the existence of social biases within these systems. In this paper, we validate and compare the ethical biases of LLMs concerning globally discussed and potentially sensitive topics, hypothesizing that these biases may arise from language-specific distinctions. We introduce the Multilingual Sensitive Questions & Answers Dataset (MSQAD): we collected news articles from Human Rights Watch covering 17 topics and generated socially sensitive questions, along with corresponding responses, in multiple languages. We then scrutinized the biases of these responses across languages and topics using two statistical hypothesis tests. The null hypotheses were rejected in most cases, indicating biases arising from cross-language differences. This demonstrates that ethical biases in responses are widespread across languages and, notably, persist even across different LLMs. By making MSQAD openly available, we aim to facilitate future research on cross-language biases in LLMs and their variant models.
- North America > United States > Virginia (0.04)
- Asia > Malaysia (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Government (0.69)
- Law > Civil Rights & Constitutional Law (0.68)
- Education > Educational Setting (0.46)
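The abstract does not say which two hypothesis tests were used, so the following is only a minimal sketch of one plausible approach: a Pearson chi-squared test of homogeneity checking whether the distribution of response categories (e.g., refuse vs. comply) is the same across languages. The counts are illustrative, not from MSQAD.

```python
# Illustrative counts: rows are languages, columns are response
# categories (refuse, comply) -- hypothetical data, not from MSQAD.
counts = [[120, 80], [95, 105], [60, 140]]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Pearson chi-squared statistic: sum over cells of (O - E)^2 / E,
# with E the expected count under the null of identical distributions.
chi2 = sum(
    (counts[i][j] - row_totals[i] * col_totals[j] / grand_total) ** 2
    / (row_totals[i] * col_totals[j] / grand_total)
    for i in range(len(counts))
    for j in range(len(counts[0]))
)
dof = (len(counts) - 1) * (len(counts[0]) - 1)  # here: 2

# Critical value for alpha = 0.05 at 2 degrees of freedom is ~5.991;
# exceeding it rejects the null of identical distributions.
reject_null = chi2 > 5.991
print(round(chi2, 2), reject_null)
```

With these made-up counts the statistic far exceeds the critical value, mirroring the paper's finding that the null hypotheses were rejected in most cases.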
BiasJailbreak: Analyzing Ethical Biases and Jailbreak Vulnerabilities in Large Language Models
Although large language models (LLMs) demonstrate impressive proficiency in various tasks, they present potential safety risks, such as "jailbreaks", where malicious inputs can coerce LLMs into generating harmful content that bypasses safety alignment. In this paper, we delve into the ethical biases in LLMs and examine how those biases could be exploited for jailbreaks. Notably, these biases result in jailbreaking success rates in GPT-4o models that differ by 20% between non-binary and cisgender keywords and by 16% between white and black keywords, even when the rest of the prompt is identical. We introduce the concept of BiasJailbreak, highlighting the inherent risks posed by these safety-induced biases. BiasJailbreak automatically generates biased keywords by querying the target LLM itself, then uses those keywords to elicit harmful output. Additionally, we propose an efficient defense method, BiasDefense, which prevents jailbreak attempts by injecting defense prompts prior to generation. BiasDefense is an appealing alternative to guard models such as Llama-Guard, which require additional inference cost after text generation. Our findings emphasize that ethical biases in LLMs can lead to unsafe output, and we suggest a method to make LLMs more secure and unbiased. To enable further research and improvements, we open-source our code and artifacts for BiasJailbreak, providing the community with tools to better understand and mitigate safety-induced biases in LLMs.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.67)
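The abstract describes BiasDefense as injecting a defense prompt before generation rather than running a guard model afterwards. A minimal sketch of that idea, with an entirely hypothetical defense prompt and helper name (the paper's actual prompts and code are in its released artifacts):

```python
# Hypothetical BiasDefense-style pre-generation prompt injection.
# DEFENSE_PROMPT and build_guarded_prompt are illustrative names,
# not taken from the paper's released code.
DEFENSE_PROMPT = (
    "Treat all demographic groups identically, and refuse any request "
    "for harmful content regardless of the identities mentioned."
)

def build_guarded_prompt(user_prompt: str) -> str:
    """Prepend the defense instruction so it applies before generation,
    avoiding the extra post-generation inference a guard model needs."""
    return f"{DEFENSE_PROMPT}\n\nUser: {user_prompt}"

guarded = build_guarded_prompt("Describe the protesters in this photo.")
print(guarded.startswith(DEFENSE_PROMPT))
```

The design point is cost: the defense text is processed in the same forward pass as the user prompt, whereas a guard model like Llama-Guard requires a second model call after generation.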
An "Unbiased" Guide to Bias in AI
Whenever ethics is mentioned in the context of AI, the topic of bias & fairness often follows. Similarly, whenever training and testing machine learning models is discussed, the trade-off between bias & variance features heavily. But do these two uses of "bias" refer to the same thing? For machines to learn patterns, especially in "supervised learning", they go through a training process whereby an algorithm extracts patterns from a training dataset, typically in an iterative manner. It then tests its predictions on an unseen (out-of-sample) test dataset to validate whether the patterns it learnt from the training dataset hold. Bias: the action of supporting or opposing a particular person or thing in an unfair way, by allowing personal opinions to influence your judgment.
Why Is It So Hard To Build An Ethical ML Framework For Healthcare?
"A disproportionate amount of power lies with research teams who, after determining the research questions, …" Improved methods of collecting high-quality data, coupled with advances in machine learning models, have fueled a new wave of healthcare practices. From retinopathy detection to computer-vision-based surgeries, algorithms have found their way into critical, life-saving domains. The potential is tremendous, yet the world remains cynical about a total embrace, because of the many ways in which bias creeps into data and, eventually, into diagnosis.